Skip to content

Implement bot detection and verification#362

Merged
jazairi merged 4 commits intomainfrom
use-410-turnstile-gem
Mar 9, 2026
Merged

Implement bot detection and verification#362
jazairi merged 4 commits intomainfrom
use-410-turnstile-gem

Conversation

@jazairi
Copy link
Contributor

@jazairi jazairi commented Feb 26, 2026

Why these changes are being introduced:

We need a preliminary solution to manage bot traffic in production.

Relevant ticket(s):

How this addresses that need:

  • Implements a Bot Detector service that uses Crawler Detect to identify bots to refer to Cloudflare Turnstile.
  • Implements a Turnstile Controller using the rails-cloudfire-turnstile gem for verification challenges.

Side effects of this change:

The new form may need additional styling. For now, I just added our button styles.

Developer

Accessibility
  • ANDI or WAVE has been run in accordance to our guide.
  • This PR contains no changes to the view layer.
  • New issues flagged by ANDI or WAVE have been resolved.
  • New issues flagged by ANDI or WAVE have been ticketed (link in the Pull Request details above).
  • No new accessibility issues have been flagged.
New ENV
  • All new ENV is documented in README.
  • All new ENV has been added to Heroku Pipeline, Staging and Prod. (Added to GeoData and TIMDEX UI prod only)
  • ENV has not changed.
Approval beyond code review
  • UXWS/stakeholder approval has been confirmed.
  • UXWS/stakeholder review will be completed retroactively.
  • UXWS/stakeholder review is not needed.
Additional context needed to review

This introduces rails-cloudflare-turnstile and crawler_detect as dependencies.

Code Reviewer

Code
  • I have confirmed that the code works as intended.
  • Any CodeClimate issues have been fixed or confirmed as
    added technical debt.
Documentation
  • The commit message is clear and follows our guidelines
    (not just this pull request message).
  • The documentation has been updated or is unnecessary.
  • New dependencies are appropriate or there were no changes.
Testing
  • There are appropriate tests covering any new functionality.
  • No additional test coverage is required.

@jazairi jazairi force-pushed the use-410-turnstile-gem branch from d4f048e to 460d738 Compare February 26, 2026 17:09
@jazairi jazairi had a problem deploying to timdex-ui-pi-use-410-tu-bsn4kh February 26, 2026 17:09 Failure
@jazairi
Copy link
Contributor Author

jazairi commented Feb 26, 2026

Looks like the coveralls outage is ongoing. I'd like to wait until that's resolved to merge, but I think we can start code review in the meantime.

@JPrevost JPrevost self-assigned this Feb 27, 2026
Copy link
Member

@JPrevost JPrevost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you configure the PR build to have this feature enabled and functioning? I want to easily be able to confirm my normal user agent is not blocked from anything but if I change my user agent to a known bot I should be challenged and then be able to pass the challenge to continue.

private

def ensure_bot_detection_enabled
head :not_found unless Feature.enabled?(:bot_detection)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are assuming this should never happen, maybe we should log to Sentry if it does? My assumption is this is handling a case in which a user agent is requesting the turnstile route even though we have it disabled which should not happen under normal circumstances.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's the idea, but now I'm wondering if we even care whether user agents are requesting the turnstile route. I might just drop this for clarity's sake.

@qltysh
Copy link

qltysh bot commented Mar 2, 2026

❌ 2 blocking issues (2 total)

Tool Category Rule Count
rubocop Lint Class has too many lines. [289/100] 1
rubocop Style Line is too long. [121/120] 1

class Feature
# List of all valid features in the application
VALID_FEATURES = %i[geodata boolean_picker oa_always simulate_search_latency tab_primo_all tab_timdex_all
VALID_FEATURES = %i[bot_detection geodata boolean_picker oa_always simulate_search_latency tab_primo_all tab_timdex_all
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Line is too long. [121/120] [rubocop:Layout/LineLength]

@jazairi jazairi had a problem deploying to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 18:09 Failure
jazairi added 2 commits March 2, 2026 10:11
Why these changes are being introduced:

We need a preliminary solution to manage bot traffic in production.

Relevant ticket(s):

- https://mitlibraries.atlassian.net/browse/USE-410

How this addresses that need:

- Implements a Bot Detector service that uses Crawler Detect to identify
bots to refer to Cloudflare Turnstile.
- Implements a Turnstile Controller using the `rails-cloudfire-turnstile`
gem for verification challenges.

Side effects of this change:

The new form may need additional styling. For now, I just added our
button styles.
@jazairi jazairi force-pushed the use-410-turnstile-gem branch from 76f6402 to fb87d65 Compare March 2, 2026 18:39
@jazairi jazairi temporarily deployed to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 18:41 Inactive
@coveralls
Copy link

coveralls commented Mar 2, 2026

Pull Request Test Coverage Report for Build 22869511702

Details

  • 37 of 37 (100.0%) changed or added relevant lines in 4 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.05%) to 98.188%

Totals Coverage Status
Change from base Build 22866584976: 0.05%
Covered Lines: 1355
Relevant Lines: 1380

💛 - Coveralls

@jazairi jazairi force-pushed the use-410-turnstile-gem branch from fb87d65 to 97274a2 Compare March 2, 2026 18:41
@jazairi jazairi temporarily deployed to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 18:43 Inactive
@jazairi jazairi temporarily deployed to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 18:46 Inactive
@jazairi jazairi force-pushed the use-410-turnstile-gem branch from 42b13a4 to 5c844c9 Compare March 2, 2026 19:32
@jazairi jazairi temporarily deployed to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 19:33 Inactive
@jazairi jazairi force-pushed the use-410-turnstile-gem branch from 5c844c9 to f04ae32 Compare March 2, 2026 19:36
@jazairi jazairi temporarily deployed to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 19:36 Inactive
- Explicity require Feature model in initializer
- Update CrawlerDetect method name
- Remove trailing whitespace from Bot Detector test
- Switch from OpenStruct to Struct
- Update test assertions

Test Turnstile root path fallback and fix typo in CrawlerDetect instantiation
@jazairi jazairi force-pushed the use-410-turnstile-gem branch from f04ae32 to bf6be71 Compare March 2, 2026 19:37
@jazairi jazairi temporarily deployed to timdex-ui-pi-use-410-tu-bsn4kh March 2, 2026 19:37 Inactive
@jazairi
Copy link
Contributor Author

jazairi commented Mar 2, 2026

@JPrevost I think this is ready for review again. I pushed two separate commits, one two respond to code review feedback, and the other for linting and cleanup. When I enabled the feature in the PR build, I learned that my method name for CrawlerDetect was wrong, so that's fixed. 🤦 I'd flipped the bots? boolean to test locally, and all the stubs had the wrong name, too -- a good case for VCR, perhaps.

In any case, here's how I've tested the feature:

  1. Set my UA to Googlebot in dev tools.
  2. Run a search and confirm that the challenge view shows.
  3. Update your UA to something non-bot and confirm that you can pass the challenge. (In my experience, it does not let you pass the challenge if using a bot UA.)

Copy link
Member

@JPrevost JPrevost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there may still be an file remnant from the initial implementation?

Some of the other comments can either be discussed, adjusted, or opened as follow-on tickets depending on how you'd prefer to address them.

@jazairi jazairi force-pushed the use-410-turnstile-gem branch from 1cb7ab3 to 5e46548 Compare March 9, 2026 18:16
@jazairi jazairi requested a review from JPrevost March 9, 2026 18:16
@jazairi
Copy link
Contributor Author

jazairi commented Mar 9, 2026

@JPrevost Thanks for the detailed review on this! Let me know if I missed anything on the latest commit. I took an initial pass at reviewing the language, but I'm not sure if UXWS should get eyes on it as well?

@jazairi jazairi force-pushed the use-410-turnstile-gem branch from 5e46548 to 94a2123 Compare March 9, 2026 18:19
Copy link
Member

@JPrevost JPrevost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

Please open a ticket with screenshots for UX to review the language once we land this. It can be a backlog ticket as the wording seems good enough for now.

Copy link
Member

@JPrevost JPrevost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for overriding my last review.

Please check if the CI failure is real before we merge.

- Clean up unused view
- Avoid mocks for bot detection
- Additional guardrails for initializer
- Soften language
@jazairi jazairi force-pushed the use-410-turnstile-gem branch from 94a2123 to a01ead6 Compare March 9, 2026 18:50
Copy link
Member

@JPrevost JPrevost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit: See previous note about a ticket for UX with screenshots to review the language. Thanks for your patience on the review!

@jazairi jazairi merged commit 0ef2f65 into main Mar 9, 2026
6 checks passed
@jazairi jazairi deleted the use-410-turnstile-gem branch March 9, 2026 19:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants